Dataset statistics
| Number of variables | 20 |
|---|---|
| Number of observations | 122636 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 18.7 MiB |
| Average record size in memory | 160.0 B |
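The two memory figures above agree with each other. The sketch below checks this, assuming a MiB is 2^20 bytes and that each of the 20 columns costs about 8 bytes per row; the second assumption is illustrative, since the report does not list per-column dtypes.

```python
# Figures copied from the "Dataset statistics" table.
n_observations = 122_636
n_variables = 20
total_mib = 18.7

BYTES_PER_MIB = 1024 ** 2

# Average record size implied by the reported total size.
avg_record_bytes = total_mib * BYTES_PER_MIB / n_observations

# Assumption (not stated in the report): one 8-byte value or pointer
# per column per row.
assumed_record_bytes = n_variables * 8

print(f"implied average record size: {avg_record_bytes:.1f} B")
print(f"assumed 8 B x {n_variables} columns: {assumed_record_bytes} B")
```

The small gap between the two numbers is consistent with the total being rounded to one decimal place.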
Variable types
| Numeric | 11 |
|---|---|
| Categorical | 9 |
id has a high cardinality: 122636 distinct values | High cardinality |
df_index is highly correlated with year_account_created | High correlation |
days_from_first_active_until_booking is highly correlated with days_from_account_created_until_first_booking and 2 other fields | High correlation |
days_from_account_created_until_first_booking is highly correlated with days_from_first_active_until_booking and 2 other fields | High correlation |
day_first_booking is highly correlated with days_from_first_active_until_booking and 2 other fields | High correlation |
day_of_week_first_booking is highly correlated with days_from_first_active_until_booking and 2 other fields | High correlation |
year_account_created is highly correlated with df_index | High correlation |
df_index is highly correlated with year_account_created | High correlation |
days_from_first_active_until_booking is highly correlated with days_from_account_created_until_first_booking and 1 other fields | High correlation |
days_from_account_created_until_first_booking is highly correlated with days_from_first_active_until_booking and 1 other fields | High correlation |
day_first_booking is highly correlated with day_of_week_first_booking | High correlation |
day_of_week_first_booking is highly correlated with days_from_first_active_until_booking and 2 other fields | High correlation |
year_account_created is highly correlated with df_index | High correlation |
df_index is highly correlated with country_destination and 4 other fields | High correlation |
signup_flow is highly correlated with affiliate_channel and 1 other fields | High correlation |
affiliate_channel is highly correlated with signup_flow and 2 other fields | High correlation |
first_affiliate_tracked is highly correlated with affiliate_channel | High correlation |
signup_app is highly correlated with signup_flow and 1 other fields | High correlation |
country_destination is highly correlated with df_index and 2 other fields | High correlation |
days_from_first_active_until_booking is highly correlated with df_index and 5 other fields | High correlation |
days_from_account_created_until_first_booking is highly correlated with df_index and 4 other fields | High correlation |
day_first_booking is highly correlated with days_from_first_active_until_booking and 3 other fields | High correlation |
day_of_week_first_booking is highly correlated with days_from_first_active_until_booking and 2 other fields | High correlation |
year_account_created is highly correlated with df_index and 4 other fields | High correlation |
day_account_created is highly correlated with day_first_booking | High correlation |
week_of_year_first_account_created is highly correlated with df_index and 3 other fields | High correlation |
days_from_first_active_until_account_created is highly skewed (γ1 = 55.3260477) | Skewed |
id is uniformly distributed | Uniform |
df_index has unique values | Unique |
id has unique values | Unique |
signup_flow has 98317 (80.2%) zeros | Zeros |
days_from_first_active_until_booking has 14839 (12.1%) zeros | Zeros |
days_from_first_active_until_account_created has 122482 (99.9%) zeros | Zeros |
days_from_account_created_until_first_booking has 14842 (12.1%) zeros | Zeros |
day_of_week_first_booking has 64567 (52.6%) zeros | Zeros |
day_of_week_first_account_created has 18949 (15.5%) zeros | Zeros |
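The percentages in the zeros alerts are the zero count divided by the number of observations (122636). The helper below reproduces that wording; `zeros_alert` is a hypothetical function written for illustration, not part of pandas-profiling.

```python
def zeros_alert(column: str, n_zeros: int, n_rows: int) -> str:
    """Format a zeros alert in the same style as the alerts listed above."""
    share = 100 * n_zeros / n_rows
    return f"{column} has {n_zeros} ({share:.1f}%) zeros"


N_ROWS = 122_636  # number of observations from the overview table

# Counts copied from the alerts above.
signup_flow_alert = zeros_alert("signup_flow", 98_317, N_ROWS)
booking_alert = zeros_alert("days_from_first_active_until_booking", 14_839, N_ROWS)

print(signup_flow_alert)
print(booking_alert)
```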
Reproduction
| Analysis started | 2022-07-23 12:35:02.694881 |
|---|---|
| Analysis finished | 2022-07-23 12:36:18.163691 |
| Duration | 1 minute and 15.47 seconds |
| Software version | pandas-profiling v3.2.0 |
| Download configuration | config.json |
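The reported duration can be recomputed from the two timestamps. This is a minimal sketch using only the standard library; the format string is an assumption based on how the timestamps are printed above.

```python
from datetime import datetime

# Assumed timestamp layout, matching the table above.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2022-07-23 12:35:02.694881", FMT)
finished = datetime.strptime("2022-07-23 12:36:18.163691", FMT)

# Elapsed wall-clock time in seconds, then split into minutes and seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minute and {seconds:.2f} seconds")
```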
df_index
Real number (ℝ≥0)
HIGH CORRELATION, UNIQUE
| Distinct | 122636 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 104123.6797 |
| Minimum | 1 |
|---|---|
| Maximum | 213448 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 958.2 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 11367.5 |
| Q1 | 50493.5 |
| median | 100991.5 |
| Q3 | 158162.25 |
| 95-th percentile | 202360.25 |
| Maximum | 213448 |
| Range | 213447 |
| Interquartile range (IQR) | 107668.75 |
Descriptive statistics
| Standard deviation | 61722.63315 |
|---|---|
| Coefficient of variation (CV) | 0.5927819042 |
| Kurtosis | -1.218367405 |
| Mean | 104123.6797 |
| Median Absolute Deviation (MAD) | 53548 |
| Skewness | 0.07863645755 |
| Sum | 1.276931159 × 10^10 |
| Variance | 3809683443 |
| Monotonicity | Strictly increasing |
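Several of the df_index statistics above follow from one another. The sketch below recomputes the derived ones from the reported mean, standard deviation, quartiles and extremes; it uses only values copied from the tables and makes no claim about how pandas-profiling computes them internally.

```python
import math

# Values copied from the df_index tables above.
n_rows = 122_636
mean = 104123.6797
std = 61722.63315
q1, q3 = 50493.5, 158162.25
minimum, maximum = 1, 213448

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
total = mean * n_rows            # sum of all values

print(f"CV={cv:.10f} variance={variance:.0f} IQR={iqr} range={value_range}")
print(f"sum={total:.9e}")
```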
| Value | Count | Frequency (%) |
| 1 | 1 | < 0.1% |
| 139171 | 1 | < 0.1% |
| 139197 | 1 | < 0.1% |
| 139196 | 1 | < 0.1% |
| 139193 | 1 | < 0.1% |
| 139192 | 1 | < 0.1% |
| 139191 | 1 | < 0.1% |
| 139190 | 1 | < 0.1% |
| 139188 | 1 | < 0.1% |
| 139186 | 1 | < 0.1% |
| Other values (122626) | 122626 |
| Value | Count | Frequency (%) |
| 1 | 1 | |
| 2 | 1 | |
| 3 | 1 | |
| 4 | 1 | |
| 6 | 1 | |
| 7 | 1 | |
| 8 | 1 | |
| 9 | 1 | |
| 10 | 1 | |
| 11 | 1 |
| Value | Count | Frequency (%) |
| 213448 | 1 | |
| 213446 | 1 | |
| 213445 | 1 | |
| 213443 | 1 | |
| 213441 | 1 | |
| 213440 | 1 | |
| 213439 | 1 | |
| 213432 | 1 | |
| 213430 | 1 | |
| 213425 | 1 |
id
Categorical
| Distinct | 122636 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 958.2 KiB |
| 820tgsjxq7 | 1 |
|---|---|
| upghc731x8 | 1 |
| 2ajt2i0cwf | 1 |
| dpsojmdsqa | 1 |
| aennz2le8b | 1 |
| Other values (122631) | 122631 |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Characters and Unicode
| Total characters | 1226360 |
|---|---|
| Distinct characters | 36 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 122636 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | 820tgsjxq7 |
|---|---|
| 2nd row | 4ft3gnwmtx |
| 3rd row | bjjt8pjhuk |
| 4th row | 87mebub9p4 |
| 5th row | lsw9q7uk0j |
Common Values
| Value | Count | Frequency (%) |
| 820tgsjxq7 | 1 | < 0.1% |
| upghc731x8 | 1 | < 0.1% |
| 2ajt2i0cwf | 1 | < 0.1% |
| dpsojmdsqa | 1 | < 0.1% |
| aennz2le8b | 1 | < 0.1% |
| facfeibe4t | 1 | < 0.1% |
| t0hkswek7a | 1 | < 0.1% |
| z329kp10k2 | 1 | < 0.1% |
| pz9eb42i12 | 1 | < 0.1% |
| fqdedvewn8 | 1 | < 0.1% |
| Other values (122626) | 122626 |
Length
| Value | Count | Frequency (%) |
| 820tgsjxq7 | 1 | < 0.1% |
| 7i49vnuav6 | 1 | < 0.1% |
| lsw9q7uk0j | 1 | < 0.1% |
| 0d01nltbrs | 1 | < 0.1% |
| a1vcnhxeij | 1 | < 0.1% |
| 6uh8zyj2gn | 1 | < 0.1% |
| yuuqmid2rp | 1 | < 0.1% |
| om1ss59ys8 | 1 | < 0.1% |
| dy3rgx56cu | 1 | < 0.1% |
| ju3h98ch3w | 1 | < 0.1% |
| Other values (122626) | 122626 |
Most occurring characters
| Value | Count | Frequency (%) |
| h | 34387 | 2.8% |
| y | 34362 | 2.8% |
| l | 34304 | 2.8% |
| t | 34296 | 2.8% |
| a | 34266 | 2.8% |
| k | 34260 | 2.8% |
| f | 34258 | 2.8% |
| 4 | 34219 | 2.8% |
| 9 | 34152 | 2.8% |
| m | 34134 | 2.8% |
| Other values (26) | 883722 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 885988 | |
| Decimal Number | 340372 | 27.8% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| h | 34387 | 3.9% |
| y | 34362 | 3.9% |
| l | 34304 | 3.9% |
| t | 34296 | 3.9% |
| a | 34266 | 3.9% |
| k | 34260 | 3.9% |
| f | 34258 | 3.9% |
| m | 34134 | 3.9% |
| x | 34133 | 3.9% |
| d | 34117 | 3.9% |
| Other values (16) | 543471 |
Decimal Number
| Value | Count | Frequency (%) |
| 4 | 34219 | |
| 9 | 34152 | |
| 3 | 34085 | |
| 0 | 34054 | |
| 7 | 34044 | |
| 5 | 34031 | |
| 8 | 34026 | |
| 1 | 34022 | |
| 2 | 34010 | |
| 6 | 33729 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 885988 | |
| Common | 340372 | 27.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| h | 34387 | 3.9% |
| y | 34362 | 3.9% |
| l | 34304 | 3.9% |
| t | 34296 | 3.9% |
| a | 34266 | 3.9% |
| k | 34260 | 3.9% |
| f | 34258 | 3.9% |
| m | 34134 | 3.9% |
| x | 34133 | 3.9% |
| d | 34117 | 3.9% |
| Other values (16) | 543471 |
Common
| Value | Count | Frequency (%) |
| 4 | 34219 | |
| 9 | 34152 | |
| 3 | 34085 | |
| 0 | 34054 | |
| 7 | 34044 | |
| 5 | 34031 | |
| 8 | 34026 | |
| 1 | 34022 | |
| 2 | 34010 | |
| 6 | 33729 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1226360 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| h | 34387 | 2.8% |
| y | 34362 | 2.8% |
| l | 34304 | 2.8% |
| t | 34296 | 2.8% |
| a | 34266 | 2.8% |
| k | 34260 | 2.8% |
| f | 34258 | 2.8% |
| 4 | 34219 | 2.8% |
| 9 | 34152 | 2.8% |
| m | 34134 | 2.8% |
| Other values (26) | 883722 |
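The character tables for id group characters by Unicode general category. The sketch below shows the same kind of grouping on the five sample ids listed above, using Python's standard `unicodedata` module; this is an assumption for illustration, as the report does not say which library it uses. The codes `Ll` and `Nd` correspond to the "Lowercase Letter" and "Decimal Number" labels.

```python
import unicodedata
from collections import Counter

# The five sample rows shown in the id "Sample" table.
sample_ids = [
    "820tgsjxq7",
    "4ft3gnwmtx",
    "bjjt8pjhuk",
    "87mebub9p4",
    "lsw9q7uk0j",
]

chars = "".join(sample_ids)

# Frequency of each individual character.
char_counts = Counter(chars)

# Frequency of each Unicode general category (e.g. "Ll", "Nd").
category_counts = Counter(unicodedata.category(c) for c in chars)

print(char_counts.most_common(3))
print(dict(category_counts))
```

On the full column the same counting yields 36 distinct characters spread almost evenly, which matches the "uniformly distributed" alert for id.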
gender
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 958.2 KiB |
| FEMALE | 56362 |
|---|---|
| MALE | 49484 |
| -unknown- | 16565 |
| OTHER | 225 |
Length
| Max length | 9 |
|---|---|
| Median length | 6 |
| Mean length | 5.596382791 |
| Min length | 4 |
Characters and Unicode
| Total characters | 686318 |
|---|---|
| Distinct characters | 15 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | MALE |
|---|---|
| 2nd row | FEMALE |
| 3rd row | FEMALE |
| 4th row | -unknown- |
| 5th row | FEMALE |
Common Values
| Value | Count | Frequency (%) |
| FEMALE | 56362 | 46.0% |
| MALE | 49484 | 40.4% |
| -unknown- | 16565 | 13.5% |
| OTHER | 225 | 0.2% |
Length
Category Frequency Plot
| Value | Count | Frequency (%) |
| female | 56362 | |
| male | 49484 | |
| unknown | 16565 | 13.5% |
| other | 225 | 0.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| E | 162433 | |
| M | 105846 | |
| A | 105846 | |
| L | 105846 | |
| F | 56362 | 8.2% |
| n | 49695 | 7.2% |
| - | 33130 | 4.8% |
| u | 16565 | 2.4% |
| k | 16565 | 2.4% |
| o | 16565 | 2.4% |
| Other values (5) | 17465 | 2.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 537233 | |
| Lowercase Letter | 115955 | 16.9% |
| Dash Punctuation | 33130 | 4.8% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| E | 162433 | |
| M | 105846 | |
| A | 105846 | |
| L | 105846 | |
| F | 56362 | 10.5% |
| O | 225 | < 0.1% |
| T | 225 | < 0.1% |
| H | 225 | < 0.1% |
| R | 225 | < 0.1% |
Lowercase Letter
| Value | Count | Frequency (%) |
| n | 49695 | |
| u | 16565 | 14.3% |
| k | 16565 | 14.3% |
| o | 16565 | 14.3% |
| w | 16565 | 14.3% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 33130 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 653188 | |
| Common | 33130 | 4.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| E | 162433 | |
| M | 105846 | |
| A | 105846 | |
| L | 105846 | |
| F | 56362 | 8.6% |
| n | 49695 | 7.6% |
| u | 16565 | 2.5% |
| k | 16565 | 2.5% |
| o | 16565 | 2.5% |
| w | 16565 | 2.5% |
| Other values (4) | 900 | 0.1% |
Common
| Value | Count | Frequency (%) |
| - | 33130 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 686318 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| E | 162433 | |
| M | 105846 | |
| A | 105846 | |
| L | 105846 | |
| F | 56362 | 8.2% |
| n | 49695 | 7.2% |
| - | 33130 | 4.8% |
| u | 16565 | 2.4% |
| k | 16565 | 2.4% |
| o | 16565 | 2.4% |
| Other values (5) | 17465 | 2.5% |
age
Real number (ℝ≥0)
| Distinct | 99 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 37.40559053 |
| Minimum | 16 |
|---|---|
| Maximum | 115 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 958.2 KiB |
Quantile statistics
| Minimum | 16 |
|---|---|
| 5-th percentile | 23 |
| Q1 | 28 |
| median | 34 |
| Q3 | 43 |
| 95-th percentile | 63 |
| Maximum | 115 |
| Range | 99 |
| Interquartile range (IQR) | 15 |
Descriptive statistics
| Standard deviation | 13.93990034 |
|---|---|
| Coefficient of variation (CV) | 0.3726689016 |
| Kurtosis | 6.51646808 |
| Mean | 37.40559053 |
| Median Absolute Deviation (MAD) | 7 |
| Skewness | 2.089718287 |
| Sum | 4587272 |
| Variance | 194.3208214 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 30 | 6039 | 4.9% |
| 31 | 5935 | 4.8% |
| 29 | 5894 | 4.8% |
| 28 | 5862 | 4.8% |
| 32 | 5763 | 4.7% |
| 27 | 5671 | 4.6% |
| 33 | 5455 | 4.4% |
| 26 | 4960 | 4.0% |
| 34 | 4940 | 4.0% |
| 35 | 4777 | 3.9% |
| Other values (89) | 67340 |
| Value | Count | Frequency (%) |
| 16 | 26 | < 0.1% |
| 17 | 64 | 0.1% |
| 18 | 665 | 0.5% |
| 19 | 1097 | 0.9% |
| 20 | 533 | 0.4% |
| 21 | 969 | 0.8% |
| 22 | 1679 | 1.4% |
| 23 | 2424 | |
| 24 | 3173 | |
| 25 | 4405 |
| Value | Count | Frequency (%) |
| 115 | 12 | < 0.1% |
| 113 | 4 | < 0.1% |
| 112 | 1 | < 0.1% |
| 111 | 2 | < 0.1% |
| 110 | 188 | 0.2% |
| 109 | 31 | < 0.1% |
| 108 | 15 | < 0.1% |
| 107 | 23 | < 0.1% |
| 106 | 17 | < 0.1% |
| 105 | 1127 |
signup_method
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 958.2 KiB |
| basic | 66039 |
|---|---|
| facebook | 56456 |
| google | 141 |
Length
| Max length | 8 |
|---|---|
| Median length | 5 |
| Mean length | 6.382212401 |
| Min length | 5 |
Characters and Unicode
| Total characters | 782689 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | |
|---|---|
| 2nd row | basic |
| 3rd row | |
| 4th row | basic |
| 5th row | basic |
Common Values
| Value | Count | Frequency (%) |
| basic | 66039 | 53.8% |
| facebook | 56456 | 46.0% |
| google | 141 | 0.1% |
Length
Category Frequency Plot
| Value | Count | Frequency (%) |
| basic | 66039 | 53.8% |
| facebook | 56456 | 46.0% |
| google | 141 | 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| b | 122495 | |
| a | 122495 | |
| c | 122495 | |
| o | 113194 | |
| s | 66039 | |
| i | 66039 | |
| e | 56597 | |
| f | 56456 | |
| k | 56456 | |
| g | 282 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 782689 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| b | 122495 | |
| a | 122495 | |
| c | 122495 | |
| o | 113194 | |
| s | 66039 | |
| i | 66039 | |
| e | 56597 | |
| f | 56456 | |
| k | 56456 | |
| g | 282 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 782689 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| b | 122495 | |
| a | 122495 | |
| c | 122495 | |
| o | 113194 | |
| s | 66039 | |
| i | 66039 | |
| e | 56597 | |
| f | 56456 | |
| k | 56456 | |
| g | 282 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 782689 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| b | 122495 | |
| a | 122495 | |
| c | 122495 | |
| o | 113194 | |
| s | 66039 | |
| i | 66039 | |
| e | 56597 | |
| f | 56456 | |
| k | 56456 | |
| g | 282 | < 0.1% |
signup_flow
Real number (ℝ≥0)
| Distinct | 17 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.51951303 |
| Minimum | 0 |
|---|---|
| Maximum | 25 |
| Zeros | 98317 |
| Zeros (%) | 80.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 958.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 24 |
| Maximum | 25 |
| Range | 25 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 6.604722668 |
|---|---|
| Coefficient of variation (CV) | 2.621428263 |
| Kurtosis | 5.928297912 |
| Mean | 2.51951303 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.705870564 |
| Sum | 308983 |
| Variance | 43.62236152 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 98317 | |
| 12 | 5996 | 4.9% |
| 25 | 5845 | 4.8% |
| 3 | 5035 | 4.1% |
| 2 | 3823 | 3.1% |
| 24 | 1596 | 1.3% |
| 23 | 993 | 0.8% |
| 1 | 509 | 0.4% |
| 21 | 195 | 0.2% |
| 8 | 142 | 0.1% |
| Other values (7) | 185 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 98317 | |
| 1 | 509 | 0.4% |
| 2 | 3823 | 3.1% |
| 3 | 5035 | 4.1% |
| 4 | 1 | < 0.1% |
| 5 | 27 | < 0.1% |
| 6 | 139 | 0.1% |
| 8 | 142 | 0.1% |
| 10 | 1 | < 0.1% |
| 12 | 5996 | 4.9% |
| Value | Count | Frequency (%) |
| 25 | 5845 | |
| 24 | 1596 | 1.3% |
| 23 | 993 | 0.8% |
| 21 | 195 | 0.2% |
| 20 | 5 | < 0.1% |
| 16 | 9 | < 0.1% |
| 15 | 3 | < 0.1% |
| 12 | 5996 | |
| 10 | 1 | < 0.1% |
| 8 | 142 | 0.1% |
language
Categorical
| Distinct | 25 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 958.2 KiB |
| en | 118205 |
|---|---|
| zh | 901 |
| fr | 807 |
| es | 625 |
| de | 407 |
| Other values (20) | 1691 |
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
Characters and Unicode
| Total characters | 245272 |
|---|---|
| Distinct characters | 19 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | en |
|---|---|
| 2nd row | en |
| 3rd row | en |
| 4th row | en |
| 5th row | en |
Common Values
| Value | Count | Frequency (%) |
| en | 118205 | 96.4% |
| zh | 901 | 0.7% |
| fr | 807 | 0.7% |
| es | 625 | 0.5% |
| de | 407 | 0.3% |
| ko | 395 | 0.3% |
| it | 347 | 0.3% |
| ru | 269 | 0.2% |
| pt | 169 | 0.1% |
| ja | 130 | 0.1% |
| Other values (15) | 381 | 0.3% |
Length
| Value | Count | Frequency (%) |
| en | 118205 | |
| zh | 901 | 0.7% |
| fr | 807 | 0.7% |
| es | 625 | 0.5% |
| de | 407 | 0.3% |
| ko | 395 | 0.3% |
| it | 347 | 0.3% |
| ru | 269 | 0.2% |
| pt | 169 | 0.1% |
| ja | 130 | 0.1% |
| Other values (15) | 381 | 0.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 119259 | |
| n | 118277 | |
| r | 1123 | 0.5% |
| h | 935 | 0.4% |
| z | 901 | 0.4% |
| f | 818 | 0.3% |
| s | 726 | 0.3% |
| t | 578 | 0.2% |
| d | 457 | 0.2% |
| o | 415 | 0.2% |
| Other values (9) | 1783 | 0.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 245272 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 119259 | |
| n | 118277 | |
| r | 1123 | 0.5% |
| h | 935 | 0.4% |
| z | 901 | 0.4% |
| f | 818 | 0.3% |
| s | 726 | 0.3% |
| t | 578 | 0.2% |
| d | 457 | 0.2% |
| o | 415 | 0.2% |
| Other values (9) | 1783 | 0.7% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 245272 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 119259 | |
| n | 118277 | |
| r | 1123 | 0.5% |
| h | 935 | 0.4% |
| z | 901 | 0.4% |
| f | 818 | 0.3% |
| s | 726 | 0.3% |
| t | 578 | 0.2% |
| d | 457 | 0.2% |
| o | 415 | 0.2% |
| Other values (9) | 1783 | 0.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 245272 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 119259 | |
| n | 118277 | |
| r | 1123 | 0.5% |
| h | 935 | 0.4% |
| z | 901 | 0.4% |
| f | 818 | 0.3% |
| s | 726 | 0.3% |
| t | 578 | 0.2% |
| d | 457 | 0.2% |
| o | 415 | 0.2% |
| Other values (9) | 1783 | 0.7% |
affiliate_channel
Categorical
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 958.2 KiB |
| direct | 79093 |
|---|---|
| sem-brand | 15347 |
| sem-non-brand | 9677 |
| other | 5357 |
| seo | 5288 |
| Other values (3) | 7874 |
Length
| Max length | 13 |
|---|---|
| Median length | 6 |
| Mean length | 6.666109462 |
| Min length | 3 |
Characters and Unicode
| Total characters | 817505 |
|---|---|
| Distinct characters | 17 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | seo |
|---|---|
| 2nd row | direct |
| 3rd row | direct |
| 4th row | direct |
| 5th row | other |
Common Values
| Value | Count | Frequency (%) |
| direct | 79093 | 64.5% |
| sem-brand | 15347 | 12.5% |
| sem-non-brand | 9677 | 7.9% |
| other | 5357 | 4.4% |
| seo | 5288 | 4.3% |
| api | 5280 | 4.3% |
| content | 2000 | 1.6% |
| remarketing | 594 | 0.5% |
Length
Category Frequency Plot
| Value | Count | Frequency (%) |
| direct | 79093 | |
| sem-brand | 15347 | 12.5% |
| sem-non-brand | 9677 | 7.9% |
| other | 5357 | 4.4% |
| seo | 5288 | 4.3% |
| api | 5280 | 4.3% |
| content | 2000 | 1.6% |
| remarketing | 594 | 0.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 117950 | |
| r | 110662 | |
| d | 104117 | |
| t | 89044 | |
| i | 84967 | |
| c | 81093 | |
| n | 48972 | |
| - | 34701 | 4.2% |
| a | 30898 | 3.8% |
| s | 30312 | 3.7% |
| Other values (7) | 84789 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 782804 | |
| Dash Punctuation | 34701 | 4.2% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 117950 | |
| r | 110662 | |
| d | 104117 | |
| t | 89044 | |
| i | 84967 | |
| c | 81093 | |
| n | 48972 | |
| a | 30898 | 3.9% |
| s | 30312 | 3.9% |
| m | 25618 | 3.3% |
| Other values (6) | 59171 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 34701 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 782804 | |
| Common | 34701 | 4.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 117950 | |
| r | 110662 | |
| d | 104117 | |
| t | 89044 | |
| i | 84967 | |
| c | 81093 | |
| n | 48972 | |
| a | 30898 | 3.9% |
| s | 30312 | 3.9% |
| m | 25618 | 3.3% |
| Other values (6) | 59171 |
Common
| Value | Count | Frequency (%) |
| - | 34701 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 817505 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 117950 | |
| r | 110662 | |
| d | 104117 | |
| t | 89044 | |
| i | 84967 | |
| c | 81093 | |
| n | 48972 | |
| - | 34701 | 4.2% |
| a | 30898 | 3.8% |
| s | 30312 | 3.7% |
| Other values (7) | 84789 |
first_affiliate_tracked
Categorical
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 958.2 KiB |
| untracked | 64712 |
|---|---|
| linked | 28284 |
| omg | 24865 |
| tracked-other | 3834 |
| product | 813 |
| Other values (2) | 128 |
Length
| Max length | 13 |
|---|---|
| Median length | 9 |
| Mean length | 7.203366059 |
| Min length | 3 |
Characters and Unicode
| Total characters | 883392 |
|---|---|
| Distinct characters | 19 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | untracked |
|---|---|
| 2nd row | untracked |
| 3rd row | untracked |
| 4th row | untracked |
| 5th row | untracked |
Common Values
| Value | Count | Frequency (%) |
| untracked | 64712 | 52.8% |
| linked | 28284 | 23.1% |
| omg | 24865 | 20.3% |
| tracked-other | 3834 | 3.1% |
| product | 813 | 0.7% |
| marketing | 101 | 0.1% |
| local ops | 27 | < 0.1% |
Length
Category Frequency Plot
| Value | Count | Frequency (%) |
| untracked | 64712 | |
| linked | 28284 | |
| omg | 24865 | 20.3% |
| tracked-other | 3834 | 3.1% |
| product | 813 | 0.7% |
| marketing | 101 | 0.1% |
| local | 27 | < 0.1% |
| ops | 27 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 100765 | |
| d | 97643 | |
| k | 96931 | |
| n | 93097 | |
| t | 73294 | |
| r | 73294 | |
| c | 69386 | |
| a | 68674 | |
| u | 65525 | |
| o | 29566 | 3.3% |
| Other values (9) | 115217 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 879531 | |
| Dash Punctuation | 3834 | 0.4% |
| Space Separator | 27 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 100765 | |
| d | 97643 | |
| k | 96931 | |
| n | 93097 | |
| t | 73294 | |
| r | 73294 | |
| c | 69386 | |
| a | 68674 | |
| u | 65525 | |
| o | 29566 | 3.4% |
| Other values (7) | 111356 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 3834 |
Space Separator
| Value | Count | Frequency (%) |
| 27 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 879531 | |
| Common | 3861 | 0.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 100765 | |
| d | 97643 | |
| k | 96931 | |
| n | 93097 | |
| t | 73294 | |
| r | 73294 | |
| c | 69386 | |
| a | 68674 | |
| u | 65525 | |
| o | 29566 | 3.4% |
| Other values (7) | 111356 |
Common
| Value | Count | Frequency (%) |
| - | 3834 | |
| 27 | 0.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 883392 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 100765 | |
| d | 97643 | |
| k | 96931 | |
| n | 93097 | |
| t | 73294 | |
| r | 73294 | |
| c | 69386 | |
| a | 68674 | |
| u | 65525 | |
| o | 29566 | 3.3% |
| Other values (9) | 115217 |
signup_app
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 958.2 KiB |
| Web | 108283 |
|---|---|
| iOS | 9689 |
| Moweb | 2364 |
| Android | 2300 |
Length
| Max length | 7 |
|---|---|
| Median length | 3 |
| Mean length | 3.113571871 |
| Min length | 3 |
Characters and Unicode
| Total characters | 381836 |
|---|---|
| Distinct characters | 13 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Web |
|---|---|
| 2nd row | Web |
| 3rd row | Web |
| 4th row | Web |
| 5th row | Web |
Common Values
| Value | Count | Frequency (%) |
| Web | 108283 | 88.3% |
| iOS | 9689 | 7.9% |
| Moweb | 2364 | 1.9% |
| Android | 2300 | 1.9% |
Length
Category Frequency Plot
| Value | Count | Frequency (%) |
| web | 108283 | |
| ios | 9689 | 7.9% |
| moweb | 2364 | 1.9% |
| android | 2300 | 1.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 110647 | |
| b | 110647 | |
| W | 108283 | |
| i | 11989 | 3.1% |
| O | 9689 | 2.5% |
| S | 9689 | 2.5% |
| o | 4664 | 1.2% |
| d | 4600 | 1.2% |
| M | 2364 | 0.6% |
| w | 2364 | 0.6% |
| Other values (3) | 6900 | 1.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 249511 | |
| Uppercase Letter | 132325 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 110647 | |
| b | 110647 | |
| i | 11989 | 4.8% |
| o | 4664 | 1.9% |
| d | 4600 | 1.8% |
| w | 2364 | 0.9% |
| n | 2300 | 0.9% |
| r | 2300 | 0.9% |
Uppercase Letter
| Value | Count | Frequency (%) |
| W | 108283 | |
| O | 9689 | 7.3% |
| S | 9689 | 7.3% |
| M | 2364 | 1.8% |
| A | 2300 | 1.7% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 381836 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 110647 | |
| b | 110647 | |
| W | 108283 | |
| i | 11989 | 3.1% |
| O | 9689 | 2.5% |
| S | 9689 | 2.5% |
| o | 4664 | 1.2% |
| d | 4600 | 1.2% |
| M | 2364 | 0.6% |
| w | 2364 | 0.6% |
| Other values (3) | 6900 | 1.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 381836 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 110647 | |
| b | 110647 | |
| W | 108283 | |
| i | 11989 | 3.1% |
| O | 9689 | 2.5% |
| S | 9689 | 2.5% |
| o | 4664 | 1.2% |
| d | 4600 | 1.2% |
| M | 2364 | 0.6% |
| w | 2364 | 0.6% |
| Other values (3) | 6900 | 1.8% |
country_destination
Categorical
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 958.2 KiB |
| NDF | 32555 |
|---|---|
| US | 28208 |
| CA | 23632 |
| AU | 22372 |
| DE | 5696 |
| Other values (7) | 10173 |
Length
| Max length | 5 |
|---|---|
| Median length | 2 |
| Mean length | 2.372411038 |
| Min length | 2 |
Characters and Unicode
| Total characters | 290943 |
|---|---|
| Distinct characters | 20 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | US |
|---|---|
| 2nd row | other |
| 3rd row | US |
| 4th row | US |
| 5th row | US |
Common Values
| Value | Count | Frequency (%) |
| NDF | 32555 | 26.5% |
| US | 28208 | 23.0% |
| CA | 23632 | 19.3% |
| AU | 22372 | 18.2% |
| DE | 5696 | 4.6% |
| other | 4372 | 3.6% |
| FR | 2130 | 1.7% |
| IT | 1228 | 1.0% |
| GB | 1019 | 0.8% |
| ES | 992 | 0.8% |
| Other values (2) | 432 | 0.4% |
Length
| Value | Count | Frequency (%) |
| ndf | 32555 | |
| us | 28208 | |
| ca | 23632 | |
| au | 22372 | |
| de | 5696 | 4.6% |
| other | 4372 | 3.6% |
| fr | 2130 | 1.7% |
| it | 1228 | 1.0% |
| gb | 1019 | 0.8% |
| es | 992 | 0.8% |
| Other values (2) | 432 | 0.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| U | 50580 | |
| A | 46004 | |
| D | 38251 | |
| F | 34685 | |
| N | 32896 | |
| S | 29200 | |
| C | 23632 | |
| E | 6688 | 2.3% |
| r | 4372 | 1.5% |
| e | 4372 | 1.5% |
| Other values (10) | 20263 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 269083 | |
| Lowercase Letter | 21860 | 7.5% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| U | 50580 | |
| A | 46004 | |
| D | 38251 | |
| F | 34685 | |
| N | 32896 | |
| S | 29200 | |
| C | 23632 | |
| E | 6688 | 2.5% |
| R | 2130 | 0.8% |
| T | 1319 | 0.5% |
| Other values (5) | 3698 | 1.4% |
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 4372 | |
| e | 4372 | |
| h | 4372 | |
| t | 4372 | |
| o | 4372 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 290943 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| U | 50580 | |
| A | 46004 | |
| D | 38251 | |
| F | 34685 | |
| N | 32896 | |
| S | 29200 | |
| C | 23632 | |
| E | 6688 | 2.3% |
| r | 4372 | 1.5% |
| e | 4372 | 1.5% |
| Other values (10) | 20263 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 290943 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| U | 50580 | |
| A | 46004 | |
| D | 38251 | |
| F | 34685 | |
| N | 32896 | |
| S | 29200 | |
| C | 23632 | |
| E | 6688 | 2.3% |
| r | 4372 | 1.5% |
| e | 4372 | 1.5% |
| Other values (10) | 20263 |
days_from_first_active_until_booking
Real number (ℝ≥0)
HIGH CORRELATION, ZEROS
| Distinct | 1853 |
|---|---|
| Distinct (%) | 1.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 356.2222104 |
| Minimum | 0 |
|---|---|
| Maximum | 2228 |
| Zeros | 14839 |
| Zeros (%) | 12.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 958.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 3 |
| median | 222 |
| Q3 | 625 |
| 95-th percentile | 1124 |
| Maximum | 2228 |
| Range | 2228 |
| Interquartile range (IQR) | 622 |
Descriptive statistics
| Standard deviation | 398.6656183 |
|---|---|
| Coefficient of variation (CV) | 1.119148685 |
| Kurtosis | 0.02666812209 |
| Mean | 356.2222104 |
| Median Absolute Deviation (MAD) | 221 |
| Skewness | 0.9379958696 |
| Sum | 43685667 |
| Variance | 158934.2752 |
| Monotonicity | Not monotonic |
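Several rows in the tables above are derived from the others. As a rough consistency check, the sketch below recomputes them from the reported figures for days_from_first_active_until_booking. The observation count is taken from the dataset overview; the variable names are illustrative and not part of the report.

```python
# Figures copied from the report for days_from_first_active_until_booking.
n_observations = 122636          # from the dataset statistics overview
total = 43685667                 # "Sum"
mean = 356.2222104               # "Mean"
std = 398.6656183                # "Standard deviation"
q1, q3 = 3, 625                  # "Q1" and "Q3"
minimum, maximum = 0, 2228       # "Minimum" and "Maximum"

# Derived quantities, recomputed from the figures above.
mean_from_sum = total / n_observations   # mean = sum / count
coefficient_of_variation = std / mean    # CV = standard deviation / mean
variance = std ** 2                      # variance = standard deviation squared
iqr = q3 - q1                            # interquartile range
value_range = maximum - minimum          # range

print(mean_from_sum, coefficient_of_variation, variance, iqr, value_range)
```

A coefficient of variation above 1, together with a median (222) well below the mean, is consistent with the right-skewed shape that the skewness row reports.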
| Value | Count | Frequency (%) |
| 0 | 14839 | 12.1% |
| 1 | 10592 | 8.6% |
| 2 | 4795 | 3.9% |
| 3 | 2956 | 2.4% |
| 4 | 2183 | 1.8% |
| 5 | 1694 | 1.4% |
| 6 | 1328 | 1.1% |
| 7 | 1237 | 1.0% |
| 8 | 992 | 0.8% |
| 9 | 812 | 0.7% |
| Other values (1843) | 81208 |
| Value | Count | Frequency (%) |
| 0 | 14839 | 12.1% |
| 1 | 10592 | 8.6% |
| 2 | 4795 | 3.9% |
| 3 | 2956 | 2.4% |
| 4 | 2183 | 1.8% |
| 5 | 1694 | 1.4% |
| 6 | 1328 | 1.1% |
| 7 | 1237 | 1.0% |
| 8 | 992 | 0.8% |
| 9 | 812 | 0.7% |
| Value | Count | Frequency (%) |
| 2228 | 1 | |
| 2001 | 2 | |
| 1999 | 1 | |
| 1995 | 1 | |
| 1992 | 1 | |
| 1991 | 2 | |
| 1990 | 2 | |
| 1980 | 1 | |
| 1979 | 1 | |
| 1977 | 1 |
days_from_first_active_until_account_created
| Distinct | 132 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.3723947291 |
| Minimum | 0 |
|---|---|
| Maximum | 1456 |
| Zeros | 122482 |
| Zeros (%) | 99.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 958.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 1456 |
| Range | 1456 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 15.17748441 |
|---|---|
| Coefficient of variation (CV) | 40.75644262 |
| Kurtosis | 3655.645933 |
| Mean | 0.3723947291 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 55.3260477 |
| Sum | 45669 |
| Variance | 230.356033 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 122482 | 99.9% |
| 1 | 4 | < 0.1% |
| 6 | 4 | < 0.1% |
| 3 | 3 | < 0.1% |
| 29 | 3 | < 0.1% |
| 634 | 2 | < 0.1% |
| 103 | 2 | < 0.1% |
| 4 | 2 | < 0.1% |
| 20 | 2 | < 0.1% |
| 722 | 2 | < 0.1% |
| Other values (122) | 130 | 0.1% |
| Value | Count | Frequency (%) |
| 0 | 122482 | 99.9% |
| 1 | 4 | < 0.1% |
| 2 | 1 | < 0.1% |
| 3 | 3 | < 0.1% |
| 4 | 2 | < 0.1% |
| 5 | 2 | < 0.1% |
| 6 | 4 | < 0.1% |
| 7 | 1 | < 0.1% |
| 9 | 2 | < 0.1% |
| 10 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 1456 | 1 | |
| 1369 | 1 | |
| 1361 | 1 | |
| 1148 | 1 | |
| 1036 | 1 | |
| 1018 | 1 | |
| 1011 | 1 | |
| 995 | 1 | |
| 882 | 1 | |
| 851 | 1 |
days_from_account_created_until_first_booking
Real number (ℝ)
HIGH CORRELATION, ZEROS
| Distinct | 1873 |
|---|---|
| Distinct (%) | 1.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 355.8498157 |
| Minimum | -349 |
|---|---|
| Maximum | 2001 |
| Zeros | 14842 |
| Zeros (%) | 12.1% |
| Negative | 25 |
| Negative (%) | < 0.1% |
| Memory size | 958.2 KiB |
Quantile statistics
| Minimum | -349 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 3 |
| median | 221 |
| Q3 | 624 |
| 95-th percentile | 1123 |
| Maximum | 2001 |
| Range | 2350 |
| Interquartile range (IQR) | 621 |
Descriptive statistics
| Standard deviation | 398.4638995 |
|---|---|
| Coefficient of variation (CV) | 1.119753002 |
| Kurtosis | 0.02362103566 |
| Mean | 355.8498157 |
| Median Absolute Deviation (MAD) | 220 |
| Skewness | 0.9374490797 |
| Sum | 43639998 |
| Variance | 158773.4792 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 14842 | 12.1% |
| 1 | 10592 | 8.6% |
| 2 | 4799 | 3.9% |
| 3 | 2958 | 2.4% |
| 4 | 2183 | 1.8% |
| 5 | 1698 | 1.4% |
| 6 | 1331 | 1.1% |
| 7 | 1237 | 1.0% |
| 8 | 993 | 0.8% |
| 9 | 810 | 0.7% |
| Other values (1863) | 81193 |
| Value | Count | Frequency (%) |
| -349 | 1 | |
| -347 | 1 | |
| -338 | 1 | |
| -308 | 1 | |
| -298 | 1 | |
| -295 | 1 | |
| -269 | 1 | |
| -261 | 1 | |
| -208 | 1 | |
| -140 | 1 |
| Value | Count | Frequency (%) |
| 2001 | 2 | |
| 1999 | 1 | |
| 1995 | 1 | |
| 1992 | 1 | |
| 1991 | 2 | |
| 1990 | 2 | |
| 1980 | 1 | |
| 1979 | 1 | |
| 1977 | 1 | |
| 1976 | 1 |
day_first_booking
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 31 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 21.61555334 |
| Minimum | 1 |
|---|---|
| Maximum | 31 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 958.2 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 14 |
| median | 28 |
| Q3 | 29 |
| 95-th percentile | 29 |
| Maximum | 31 |
| Range | 30 |
| Interquartile range (IQR) | 15 |
Descriptive statistics
| Standard deviation | 9.283882799 |
|---|---|
| Coefficient of variation (CV) | 0.429500122 |
| Kurtosis | -0.754697507 |
| Mean | 21.61555334 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | -0.853826875 |
| Sum | 2650845 |
| Variance | 86.19047983 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 29 | 56857 | 46.4% |
| 16 | 2344 | 1.9% |
| 17 | 2328 | 1.9% |
| 11 | 2321 | 1.9% |
| 10 | 2318 | 1.9% |
| 13 | 2313 | 1.9% |
| 15 | 2300 | 1.9% |
| 3 | 2291 | 1.9% |
| 5 | 2284 | 1.9% |
| 12 | 2280 | 1.9% |
| Other values (21) | 45000 |
| Value | Count | Frequency (%) |
| 1 | 2104 | |
| 2 | 2189 | |
| 3 | 2291 | |
| 4 | 2161 | |
| 5 | 2284 | |
| 6 | 2228 | |
| 7 | 2259 | |
| 8 | 2272 | |
| 9 | 2212 | |
| 10 | 2318 |
| Value | Count | Frequency (%) |
| 31 | 1194 | 1.0% |
| 30 | 2055 | 1.7% |
| 29 | 56857 | 46.4% |
| 28 | 2157 | 1.8% |
| 27 | 2098 | 1.7% |
| 26 | 2149 | 1.8% |
| 25 | 2195 | 1.8% |
| 24 | 2219 | 1.8% |
| 23 | 2212 | 1.8% |
| 22 | 2234 | 1.8% |
day_of_week_first_booking
Real number (ℝ≥0)
HIGH CORRELATION, ZEROS
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.549520532 |
| Minimum | 0 |
|---|---|
| Maximum | 6 |
| Zeros | 64567 |
| Zeros (%) | 52.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 958.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 3 |
| 95-th percentile | 6 |
| Maximum | 6 |
| Range | 6 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 1.992697276 |
|---|---|
| Coefficient of variation (CV) | 1.286008952 |
| Kurtosis | -0.4927185682 |
| Mean | 1.549520532 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 0.9556386108 |
| Sum | 190027 |
| Variance | 3.970842433 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 64567 | 52.6% |
| 1 | 10913 | 8.9% |
| 2 | 10909 | 8.9% |
| 3 | 10610 | 8.7% |
| 4 | 10180 | 8.3% |
| 5 | 7996 | 6.5% |
| 6 | 7461 | 6.1% |
| Value | Count | Frequency (%) |
| 0 | 64567 | 52.6% |
| 1 | 10913 | 8.9% |
| 2 | 10909 | 8.9% |
| 3 | 10610 | 8.7% |
| 4 | 10180 | 8.3% |
| 5 | 7996 | 6.5% |
| 6 | 7461 | 6.1% |
| Value | Count | Frequency (%) |
| 6 | 7461 | 6.1% |
| 5 | 7996 | 6.5% |
| 4 | 10180 | 8.3% |
| 3 | 10610 | 8.7% |
| 2 | 10909 | 8.9% |
| 1 | 10913 | 8.9% |
| 0 | 64567 | 52.6% |
year_account_created
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 958.2 KiB |
| Value | Count |
|---|---|
| 2013 | 47619 |
| 2014 | 42059 |
| 2012 | 24820 |
| 2011 | 6743 |
| 2010 | 1395 |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Characters and Unicode
| Total characters | 490544 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2011 |
|---|---|
| 2nd row | 2010 |
| 3rd row | 2011 |
| 4th row | 2010 |
| 5th row | 2010 |
Common Values
| Value | Count | Frequency (%) |
| 2013 | 47619 | 38.8% |
| 2014 | 42059 | 34.3% |
| 2012 | 24820 | 20.2% |
| 2011 | 6743 | 5.5% |
| 2010 | 1395 | 1.1% |
Category Frequency Plot
| Value | Count | Frequency (%) |
| 2013 | 47619 | 38.8% |
| 2014 | 42059 | 34.3% |
| 2012 | 24820 | 20.2% |
| 2011 | 6743 | 5.5% |
| 2010 | 1395 | 1.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 147456 | |
| 1 | 129379 | |
| 0 | 124031 | |
| 3 | 47619 | 9.7% |
| 4 | 42059 | 8.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 490544 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 147456 | |
| 1 | 129379 | |
| 0 | 124031 | |
| 3 | 47619 | 9.7% |
| 4 | 42059 | 8.6% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 490544 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 147456 | |
| 1 | 129379 | |
| 0 | 124031 | |
| 3 | 47619 | 9.7% |
| 4 | 42059 | 8.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 490544 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 147456 | |
| 1 | 129379 | |
| 0 | 124031 | |
| 3 | 47619 | 9.7% |
| 4 | 42059 | 8.6% |
day_account_created
| Distinct | 31 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15.8485355 |
| Minimum | 1 |
|---|---|
| Maximum | 31 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 958.2 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 8 |
| median | 16 |
| Q3 | 23 |
| 95-th percentile | 29 |
| Maximum | 31 |
| Range | 30 |
| Interquartile range (IQR) | 15 |
Descriptive statistics
| Standard deviation | 8.747581973 |
|---|---|
| Coefficient of variation (CV) | 0.5519489149 |
| Kurtosis | -1.189826952 |
| Mean | 15.8485355 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | -0.005988955419 |
| Sum | 1943601 |
| Variance | 76.52019038 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 24 | 4191 | 3.4% |
| 16 | 4154 | 3.4% |
| 23 | 4140 | 3.4% |
| 20 | 4131 | 3.4% |
| 18 | 4129 | 3.4% |
| 28 | 4128 | 3.4% |
| 27 | 4116 | 3.4% |
| 12 | 4104 | 3.3% |
| 11 | 4095 | 3.3% |
| 10 | 4091 | 3.3% |
| Other values (21) | 81357 |
| Value | Count | Frequency (%) |
| 1 | 3510 | |
| 2 | 3980 | |
| 3 | 3977 | |
| 4 | 3972 | |
| 5 | 4045 | |
| 6 | 4039 | |
| 7 | 3886 | |
| 8 | 3971 | |
| 9 | 4057 | |
| 10 | 4091 |
| Value | Count | Frequency (%) |
| 31 | 2166 | |
| 30 | 3845 | |
| 29 | 3809 | |
| 28 | 4128 | |
| 27 | 4116 | |
| 26 | 4023 | |
| 25 | 3942 | |
| 24 | 4191 | |
| 23 | 4140 | |
| 22 | 4035 |
day_of _week_first_account_created
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.756841384 |
| Minimum | 0 |
|---|---|
| Maximum | 6 |
| Zeros | 18949 |
| Zeros (%) | 15.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 958.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 3 |
| Q3 | 4 |
| 95-th percentile | 6 |
| Maximum | 6 |
| Range | 6 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 1.943418264 |
|---|---|
| Coefficient of variation (CV) | 0.704943808 |
| Kurtosis | -1.148452726 |
| Mean | 2.756841384 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.1708455936 |
| Sum | 338088 |
| Variance | 3.776874547 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 20245 | |
| 2 | 19662 | |
| 0 | 18949 | |
| 3 | 18632 | |
| 4 | 17119 | |
| 5 | 14027 | |
| 6 | 14002 |
| Value | Count | Frequency (%) |
| 0 | 18949 | |
| 1 | 20245 | |
| 2 | 19662 | |
| 3 | 18632 | |
| 4 | 17119 | |
| 5 | 14027 | |
| 6 | 14002 |
| Value | Count | Frequency (%) |
| 6 | 14002 | |
| 5 | 14027 | |
| 4 | 17119 | |
| 3 | 18632 | |
| 2 | 19662 | |
| 1 | 20245 | |
| 0 | 18949 |
week_of _year_first_account_created
| Distinct | 53 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 24.38924949 |
| Minimum | 1 |
|---|---|
| Maximum | 53 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 958.2 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 13 |
| median | 23 |
| Q3 | 36 |
| 95-th percentile | 49 |
| Maximum | 53 |
| Range | 52 |
| Interquartile range (IQR) | 23 |
Descriptive statistics
| Standard deviation | 14.0251999 |
|---|---|
| Coefficient of variation (CV) | 0.5750566415 |
| Kurtosis | -0.9463708598 |
| Mean | 24.38924949 |
| Median Absolute Deviation (MAD) | 11 |
| Skewness | 0.2578035502 |
| Sum | 2991000 |
| Variance | 196.7062322 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 26 | 3982 | 3.2% |
| 25 | 3691 | 3.0% |
| 21 | 3635 | 3.0% |
| 24 | 3622 | 3.0% |
| 23 | 3584 | 2.9% |
| 20 | 3492 | 2.8% |
| 22 | 3331 | 2.7% |
| 18 | 3247 | 2.6% |
| 19 | 3245 | 2.6% |
| 17 | 3241 | 2.6% |
| Other values (43) | 87566 |
| Value | Count | Frequency (%) |
| 1 | 1919 | |
| 2 | 2286 | |
| 3 | 2451 | |
| 4 | 2243 | |
| 5 | 2286 | |
| 6 | 2375 | |
| 7 | 2282 | |
| 8 | 2375 | |
| 9 | 2552 | |
| 10 | 2514 |
| Value | Count | Frequency (%) |
| 53 | 2 | < 0.1% |
| 52 | 1655 | |
| 51 | 1714 | |
| 50 | 1754 | |
| 49 | 1973 | |
| 48 | 1739 | |
| 47 | 1765 | |
| 46 | 1858 | |
| 45 | 1827 | |
| 44 | 1631 |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.
To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
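The rank-then-correlate recipe for ρ can be sketched in plain Python as follows. This is an illustrative sketch, not the code the report generator runs: the helper names and the toy data are assumptions, and ties are handled by average ranks, which is one common convention.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def spearman_rho(x, y):
    """Spearman's rho is Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Toy data (made up): y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

Because the toy relationship is strictly increasing, the ranks of x and y coincide and ρ comes out at 1 up to floating-point error, while r stays below 1 because the relationship is not linear.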
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the steepness of the line does not affect r (only the sign of the slope does).
To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
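The invariance property can be checked numerically. The sketch below uses a hand-written `pearson_r` helper and made-up data, both of which are assumptions for illustration only. Shifting and rescaling either variable by a positive factor leaves r unchanged, and negating one variable only flips its sign.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Toy data (made up): roughly linear with a little noise.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r_original = pearson_r(x, y)
# Change location and scale of both variables independently.
r_rescaled = pearson_r([3 * a + 10 for a in x], [0.5 * b - 2 for b in y])
# Negating one variable reverses the direction of the association.
r_flipped = pearson_r(x, [-b for b in y])

print(r_original, r_rescaled, r_flipped)
```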
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.
To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
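The pair-counting definition can be sketched directly. The function below implements the simple form described in the text (often called τ-a) and counts tied pairs as neither concordant nor discordant. Library implementations commonly apply a tie correction, so their results may differ on data with many ties. The function name and the toy data are assumptions for illustration.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1      # both variables move in the same direction
        elif direction < 0:
            discordant += 1      # the variables move in opposite directions
    return (concordant - discordant) / len(pairs)

# Toy data (made up): 7 concordant and 3 discordant pairs out of 10.
x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 5, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

This version compares every pair, so its cost grows quadratically with the number of observations; it is meant to illustrate the definition rather than to scale to a dataset of this size.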
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure that was proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for it.
First rows
| | df_index | id | gender | age | signup_method | signup_flow | language | affiliate_channel | first_affiliate_tracked | signup_app | country_destination | days_from_first_active_until_booking | days_from_first_active_until_account_created | days_from_account_created_until_first_booking | day_first_booking | day_of_week_first_booking | year_account_created | day_account_created | day_of _week_first_account_created | week_of _year_first_account_created |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 820tgsjxq7 | MALE | 38 | | 0 | en | seo | untracked | Web | US | 2228 | 732 | 1496 | 29 | 0 | 2011 | 25 | 2 | 21 |
| 1 | 2 | 4ft3gnwmtx | FEMALE | 56 | basic | 3 | en | direct | untracked | Web | other | 419 | 476 | -57 | 2 | 0 | 2010 | 28 | 1 | 39 |
| 2 | 3 | bjjt8pjhuk | FEMALE | 42 | | 0 | en | direct | untracked | Web | US | 1043 | 765 | 278 | 8 | 5 | 2011 | 5 | 0 | 49 |
| 3 | 4 | 87mebub9p4 | -unknown- | 41 | basic | 0 | en | direct | untracked | Web | US | 72 | 280 | -208 | 18 | 3 | 2010 | 14 | 1 | 37 |
| 4 | 6 | lsw9q7uk0j | FEMALE | 46 | basic | 0 | en | other | untracked | Web | US | 3 | 0 | 3 | 5 | 1 | 2010 | 2 | 5 | 53 |
| 5 | 7 | 0d01nltbrs | FEMALE | 47 | basic | 0 | en | direct | omg | Web | US | 10 | 0 | 10 | 13 | 2 | 2010 | 3 | 6 | 53 |
| 6 | 8 | a1vcnhxeij | FEMALE | 50 | basic | 0 | en | other | untracked | Web | US | 206 | 0 | 206 | 29 | 3 | 2010 | 4 | 0 | 1 |
| 7 | 9 | 6uh8zyj2gn | -unknown- | 46 | basic | 0 | en | other | omg | Web | NDF | 0 | 0 | 0 | 4 | 0 | 2010 | 4 | 0 | 1 |
| 8 | 10 | yuuqmid2rp | FEMALE | 36 | basic | 0 | en | other | untracked | Web | NDF | 2 | 0 | 2 | 6 | 2 | 2010 | 4 | 0 | 1 |
| 9 | 11 | om1ss59ys8 | FEMALE | 47 | basic | 0 | en | other | untracked | Web | NDF | 2001 | 0 | 2001 | 29 | 0 | 2010 | 5 | 1 | 1 |
Last rows
| | df_index | id | gender | age | signup_method | signup_flow | language | affiliate_channel | first_affiliate_tracked | signup_app | country_destination | days_from_first_active_until_booking | days_from_first_active_until_account_created | days_from_account_created_until_first_booking | day_first_booking | day_of_week_first_booking | year_account_created | day_account_created | day_of _week_first_account_created | week_of _year_first_account_created |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 122626 | 213425 | l1f71f9vsj | FEMALE | 30 | | 0 | en | direct | linked | Web | DE | 364 | 0 | 364 | 29 | 0 | 2014 | 30 | 0 | 27 |
| 122627 | 213430 | 79wk7k2k5t | -unknown- | 19 | basic | 0 | en | direct | linked | Web | DE | 364 | 0 | 364 | 29 | 0 | 2014 | 30 | 0 | 27 |
| 122628 | 213432 | rg7ayg1tob | MALE | 31 | | 0 | en | direct | tracked-other | Web | DE | 364 | 0 | 364 | 29 | 0 | 2014 | 30 | 0 | 27 |
| 122629 | 213439 | msucfwmlzc | MALE | 43 | basic | 0 | en | direct | untracked | Web | DE | 259 | 0 | 259 | 16 | 0 | 2014 | 30 | 0 | 27 |
| 122630 | 213440 | 04y8115avm | FEMALE | 24 | basic | 25 | en | direct | untracked | iOS | DE | 364 | 0 | 364 | 29 | 0 | 2014 | 30 | 0 | 27 |
| 122631 | 213441 | omlc9iku7t | FEMALE | 34 | basic | 0 | en | direct | linked | Web | DE | 44 | 0 | 44 | 13 | 2 | 2014 | 30 | 0 | 27 |
| 122632 | 213443 | 0k26r3mir0 | FEMALE | 36 | basic | 0 | en | sem-brand | linked | Web | DE | 13 | 0 | 13 | 13 | 6 | 2014 | 30 | 0 | 27 |
| 122633 | 213445 | qbxza0xojf | FEMALE | 23 | basic | 0 | en | sem-brand | omg | Web | DE | 2 | 0 | 2 | 2 | 2 | 2014 | 30 | 0 | 27 |
| 122634 | 213446 | zxodksqpep | MALE | 32 | basic | 0 | en | sem-brand | omg | Web | DE | 364 | 0 | 364 | 29 | 0 | 2014 | 30 | 0 | 27 |
| 122635 | 213448 | 6o3arsjbb4 | -unknown- | 32 | basic | 0 | en | direct | untracked | Web | DE | 364 | 0 | 364 | 29 | 0 | 2014 | 30 | 0 | 27 |